# Prior-Aware Decoding

Accompanying code for the paper "Mitigating the Influence of Distractor Tasks in LMs with Prior-Aware Decoding", anonymized for conference submission.

The code will be open-sourced upon publication.

## Running instructions

Use the Poetry package manager to create a virtual environment and install the project's Python dependencies. Then open and run `main-evaluation.ipynb` in Jupyter Lab.

```
poetry install
poetry run jupyter lab
# Open the link shown and run main-evaluation.ipynb or main-evaluation-2models.ipynb
```

Note that you may need to update the project file (`pyproject.toml`) to use the PyTorch version appropriate for your GPU; the included version is CPU-only by default.

All calls to the language models are cached, so the computation can be interrupted and resumed (do back up the `CACHE_*` files, though).
Note that you will need an OpenAI API key to evaluate the OpenAI models.

## Strong local prior dataset

A dataset of strong local priors based on minor alterations to very common sequences (e.g. writing the alphabet while skipping a given letter) is generated by `strong-local-priors-gen.py`.
The dataset exists in two variants: classification (the variant used in the paper), in `strong-local-priors_classification.jsonl`, and sequence completion, in `strong-local-priors_sequence-probability.jsonl`.
Both files are in the format of the Inverse Scaling Problems dataset.
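
To inspect the generated files, a minimal sketch such as the following may help. It assumes only that each file is in JSON Lines form (one JSON object per line, as the `.jsonl` extension suggests) and makes no assumption about field names. Both `load_jsonl` and `alphabet_skipping` are hypothetical helpers written for illustration; they are not part of this repository, and `alphabet_skipping` only illustrates the kind of altered sequence described above, not how `strong-local-priors-gen.py` builds its prompts.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Read a JSON Lines file: one JSON object per non-empty line."""
    records = []
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # tolerate blank lines, e.g. a trailing newline
                records.append(json.loads(line))
    return records


def alphabet_skipping(letter):
    """Illustration only: the uppercase alphabet with one letter left out."""
    skipped = letter.upper()
    return [c for c in "ABCDEFGHIJKLMNOPQRSTUVWXYZ" if c != skipped]
```

For example, `load_jsonl("strong-local-priors_classification.jsonl")` returns a list of records, and looking at the keys of the first record shows which fields the file actually provides.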
